Overview

Dataset statistics

Number of variables: 18
Number of observations: 299
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 42.2 KiB
Average record size in memory: 144.4 B

Variable types

NUM: 9
BOOL: 7
DATE: 1
CAT: 1

Reproduction

Analysis started: 2020-11-05 16:43:30.711185
Analysis finished: 2020-11-05 16:43:47.136159
Duration: 16.42 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

age
Real number (ℝ≥0)

Distinct: 47
Distinct (%): 15.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60.83389298
Minimum: 40
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 40
5-th percentile: 42.9
Q1: 51
median: 60
Q3: 70
95-th percentile: 82
Maximum: 95
Range: 55
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 11.89480907
Coefficient of variation (CV): 0.1955293093
Kurtosis: -0.184870532
Mean: 60.83389298
Median Absolute Deviation (MAD): 10
Skewness: 0.4230619067
Sum: 18189.334
Variance: 141.4864829
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
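
Several of the age statistics above are tied together by simple identities, which allows a quick consistency check. The following is a minimal sketch that recomputes them from the reported values. The variable names are illustrative, and it assumes that the CV is the standard deviation divided by the mean, that the variance is the squared standard deviation, and that the mean is the sum divided by the 299 observations.

```python
# Values copied from the age statistics reported above.
mean = 60.83389298
std = 11.89480907
total = 18189.334
n_observations = 299
minimum, maximum = 40, 95
q1, q3 = 51, 70

value_range = maximum - minimum       # compare with the reported Range (55)
iqr = q3 - q1                         # compare with the reported IQR (19)
cv = std / mean                       # compare with the reported CV (0.1955293093)
variance = std ** 2                   # compare with the reported Variance (141.4864829)
implied_mean = total / n_observations # compare with the reported Mean (60.83389298)

print(value_range, iqr, cv, variance, implied_mean)
```

The same checks can be repeated for any other numeric variable by swapping in its reported values.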
Value              Count  Frequency (%)
60                 33     11.0%
50                 27     9.0%
65                 26     8.7%
70                 25     8.4%
45                 19     6.4%
55                 17     5.7%
75                 11     3.7%
58                 10     3.3%
53                 10     3.3%
63                 8      2.7%
Other values (37)  113    37.8%

Value  Count  Frequency (%)
40     7      2.3%
41     1      0.3%
42     7      2.3%
43     1      0.3%
44     2      0.7%

Value  Count  Frequency (%)
95     2      0.7%
94     1      0.3%
90     3      1.0%
87     1      0.3%
86     1      0.3%

anaemia
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
0      170    56.9%
1      129    43.1%

creatinine_phosphokinase
Real number (ℝ≥0)

Distinct: 208
Distinct (%): 69.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 581.8394649
Minimum: 23
Maximum: 7861
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 23
5-th percentile: 59
Q1: 116.5
median: 250
Q3: 582
95-th percentile: 2263
Maximum: 7861
Range: 7838
Interquartile range (IQR): 465.5

Descriptive statistics

Standard deviation: 970.2878807
Coefficient of variation (CV): 1.667621293
Kurtosis: 25.1490462
Mean: 581.8394649
Median Absolute Deviation (MAD): 182
Skewness: 4.463110085
Sum: 173970
Variance: 941458.5715
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
582                 47     15.7%
66                  4      1.3%
129                 4      1.3%
84                  3      1.0%
231                 3      1.0%
115                 3      1.0%
59                  3      1.0%
68                  3      1.0%
60                  3      1.0%
69                  3      1.0%
Other values (198)  223    74.6%

Value  Count  Frequency (%)
23     1      0.3%
30     1      0.3%
47     3      1.0%
52     1      0.3%
53     1      0.3%

Value  Count  Frequency (%)
7861   1      0.3%
7702   1      0.3%
5882   1      0.3%
5209   1      0.3%
4540   1      0.3%

diabetes
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
0      174    58.2%
1      125    41.8%

ejection_fraction
Real number (ℝ≥0)

Distinct: 17
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.08361204
Minimum: 14
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 14
5-th percentile: 20
Q1: 30
median: 38
Q3: 45
95-th percentile: 60
Maximum: 80
Range: 66
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 11.83484074
Coefficient of variation (CV): 0.3107594082
Kurtosis: 0.04140935982
Mean: 38.08361204
Median Absolute Deviation (MAD): 8
Skewness: 0.5553827517
Sum: 11387
Variance: 140.0634554
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value             Count  Frequency (%)
35                49     16.4%
38                40     13.4%
40                37     12.4%
25                36     12.0%
30                34     11.4%
60                31     10.4%
50                21     7.0%
45                20     6.7%
20                18     6.0%
55                3      1.0%
Other values (7)  10     3.3%

Value  Count  Frequency (%)
14     1      0.3%
15     2      0.7%
17     2      0.7%
20     18     6.0%
25     36     12.0%

Value  Count  Frequency (%)
80     1      0.3%
70     1      0.3%
65     1      0.3%
62     2      0.7%
60     31     10.4%

high_blood_pressure
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
0      194    64.9%
1      105    35.1%

platelets
Real number (ℝ≥0)

Distinct: 176
Distinct (%): 58.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 263358.0293
Minimum: 25100
Maximum: 850000
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 25100
5-th percentile: 131800
Q1: 212500
median: 262000
Q3: 303500
95-th percentile: 422500
Maximum: 850000
Range: 824900
Interquartile range (IQR): 91000

Descriptive statistics

Standard deviation: 97804.23687
Coefficient of variation (CV): 0.3713736663
Kurtosis: 6.209254515
Mean: 263358.0293
Median Absolute Deviation (MAD): 44000
Skewness: 1.462320838
Sum: 78744050.75
Variance: 9565668749
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
263358.03           25     8.4%
271000              4      1.3%
221000              4      1.3%
255000              4      1.3%
228000              4      1.3%
235000              4      1.3%
279000              4      1.3%
237000              4      1.3%
305000              4      1.3%
226000              4      1.3%
Other values (166)  238    79.6%

Value  Count  Frequency (%)
25100  1      0.3%
47000  1      0.3%
51000  1      0.3%
62000  1      0.3%
70000  1      0.3%

Value   Count  Frequency (%)
850000  1      0.3%
742000  1      0.3%
621000  1      0.3%
543000  1      0.3%
533000  1      0.3%

serum_creatinine
Real number (ℝ≥0)

Distinct: 40
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.393879599
Minimum: 0.5
Maximum: 9.4
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 0.7
Q1: 0.9
median: 1.1
Q3: 1.4
95-th percentile: 3
Maximum: 9.4
Range: 8.9
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 1.034510064
Coefficient of variation (CV): 0.7421803613
Kurtosis: 25.82823866
Mean: 1.393879599
Median Absolute Deviation (MAD): 0.2
Skewness: 4.455995882
Sum: 416.77
Variance: 1.070211073
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)
Value              Count  Frequency (%)
1                  50     16.7%
0.9                32     10.7%
1.1                32     10.7%
1.2                24     8.0%
0.8                24     8.0%
1.3                20     6.7%
0.7                19     6.4%
1.18               11     3.7%
1.4                9      3.0%
1.7                9      3.0%
Other values (30)  69     23.1%

Value  Count  Frequency (%)
0.5    1      0.3%
0.6    4      1.3%
0.7    19     6.4%
0.75   1      0.3%
0.8    24     8.0%

Value  Count  Frequency (%)
9.4    1      0.3%
9      1      0.3%
6.8    1      0.3%
6.1    1      0.3%
5.8    1      0.3%

serum_sodium
Real number (ℝ≥0)

Distinct: 27
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 136.6254181
Minimum: 113
Maximum: 148
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 113
5-th percentile: 130
Q1: 134
median: 137
Q3: 140
95-th percentile: 144
Maximum: 148
Range: 35
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.412477284
Coefficient of variation (CV): 0.03229616675
Kurtosis: 4.119712008
Mean: 136.6254181
Median Absolute Deviation (MAD): 3
Skewness: -1.048136016
Sum: 40851
Variance: 19.46995578
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Value              Count  Frequency (%)
136                40     13.4%
137                38     12.7%
140                35     11.7%
134                32     10.7%
138                23     7.7%
139                22     7.4%
135                16     5.4%
132                14     4.7%
141                12     4.0%
142                11     3.7%
Other values (17)  56     18.7%

Value  Count  Frequency (%)
113    1      0.3%
116    1      0.3%
121    1      0.3%
124    1      0.3%
125    1      0.3%

Value  Count  Frequency (%)
148    1      0.3%
146    1      0.3%
145    9      3.0%
144    5      1.7%
143    3      1.0%

sex
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
1      194    64.9%
0      105    35.1%

smoking
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
0      203    67.9%
1      96     32.1%

time
Real number (ℝ≥0)

Distinct: 148
Distinct (%): 49.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 130.2608696
Minimum: 4
Maximum: 285
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 4
5-th percentile: 12.9
Q1: 73
median: 115
Q3: 203
95-th percentile: 250
Maximum: 285
Range: 281
Interquartile range (IQR): 130

Descriptive statistics

Standard deviation: 77.61420795
Coefficient of variation (CV): 0.5958367099
Kurtosis: -1.212047967
Mean: 130.2608696
Median Absolute Deviation (MAD): 71
Skewness: 0.1278026456
Sum: 38948
Variance: 6023.965276
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
187                 7      2.3%
250                 7      2.3%
186                 6      2.0%
107                 6      2.0%
10                  6      2.0%
244                 5      1.7%
95                  5      1.7%
146                 5      1.7%
209                 5      1.7%
245                 5      1.7%
Other values (138)  242    80.9%

Value  Count  Frequency (%)
4      1      0.3%
6      1      0.3%
7      2      0.7%
8      2      0.7%
10     6      2.0%

Value  Count  Frequency (%)
285    1      0.3%
280    1      0.3%
278    1      0.3%
271    1      0.3%
270    2      0.7%

DEATH_EVENT
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
NO     203    67.9%
YES    96     32.1%

Day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.88294314
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 9
median: 18
Q3: 26
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 9.488407447
Coefficient of variation (CV): 0.5620114553
Kurtosis: -1.297293474
Mean: 16.88294314
Median Absolute Deviation (MAD): 8
Skewness: -0.1735381674
Sum: 5048
Variance: 90.02987587
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value              Count  Frequency (%)
30                 22     7.4%
26                 18     6.0%
4                  16     5.4%
3                  15     5.0%
24                 15     5.0%
13                 14     4.7%
29                 14     4.7%
27                 14     4.7%
18                 12     4.0%
1                  12     4.0%
Other values (21)  147    49.2%

Value  Count  Frequency (%)
1      12     4.0%
2      7      2.3%
3      15     5.0%
4      16     5.4%
5      5      1.7%

Value  Count  Frequency (%)
31     6      2.0%
30     22     7.4%
29     14     4.7%
28     5      1.7%
27     14     4.7%

Month
Real number (ℝ≥0)

Distinct: 10
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.598662207
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 6
median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.600877153
Coefficient of variation (CV): 0.3422809281
Kurtosis: -0.9699906544
Mean: 7.598662207
Median Absolute Deviation (MAD): 2
Skewness: 0.05845493244
Sum: 2272
Variance: 6.764561963
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
10     55     18.4%
6      52     17.4%
7      46     15.4%
4      40     13.4%
12     24     8.0%
5      24     8.0%
11     21     7.0%
8      20     6.7%
9      14     4.7%
1      3      1.0%

Value  Count  Frequency (%)
1      3      1.0%
4      40     13.4%
5      24     8.0%
6      52     17.4%
7      46     15.4%

Value  Count  Frequency (%)
12     24     8.0%
11     21     7.0%
10     55     18.4%
9      14     4.7%
8      20     6.7%

Year
Categorical

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
2016   296    99.0%
2015   3      1.0%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Date
Date

Distinct: 148
Distinct (%): 49.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
Minimum: 2015-01-03 00:00:00
Maximum: 2016-12-27 00:00:00
Histogram with fixed size bins (bins=50)

DEATH_EVENT_num
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count  Frequency (%)
0      203    67.9%
1      96     32.1%

Interactions

[Figures: 81 pairwise interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the steepness of the line does not affect r (only the sign of the slope does).

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
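
The calculation described above can be sketched in plain Python. This is an illustrative helper, not the code pandas-profiling runs: the function name `pearson_r` is an assumption, and the data are the age and serum_creatinine values of the first ten rows listed in the Sample section, so the result describes only those ten rows.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 denominator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# age and serum_creatinine for the first ten rows of the Sample section.
age = [75.0, 55.0, 65.0, 50.0, 65.0, 90.0, 75.0, 60.0, 65.0, 80.0]
serum_creatinine = [1.9, 1.1, 1.3, 1.9, 2.7, 2.1, 1.2, 1.1, 1.5, 9.4]

r = pearson_r(age, serum_creatinine)
# Shifting and rescaling either variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([2 * a + 3 for a in age],
                       [0.5 * s - 1 for s in serum_creatinine])
print(r, r_rescaled)
```

The two printed values should agree up to floating-point error, which illustrates the invariance under changes in location and scale.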

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
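
As a minimal sketch of that recipe, the code below ranks each variable and then applies the Pearson formula to the ranks. The function names are illustrative, and giving tied values the average of their ranks is an assumption; the report does not say how ties are handled. The toy data are made up to show a nonlinear but monotonic relationship.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance over the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy data ρ reaches its maximum while r stays below it, which is the behaviour the paragraph above describes.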

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
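
The pair-counting definition above can be sketched directly. The function name `kendall_tau_a` is illustrative, and the sketch implements the formula exactly as stated (often called tau-a), where tied pairs count as neither concordant nor discordant. Statistical libraries often apply a tie correction instead, so their results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither total
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: two of the three pairs are concordant and one is discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

The double loop over all pairs is quadratic in the number of observations, which is fine for a dataset of 299 rows.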

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: two missing-value plots]

Sample

First rows

     age   anaemia  creatinine_phosphokinase  diabetes  ejection_fraction  high_blood_pressure  platelets  serum_creatinine  serum_sodium  sex  smoking  time  DEATH_EVENT  Day   Month  Year  Date        DEATH_EVENT_num
0    75.0  0        582                       0         20                 1                    265000.00  1.9               130           1    0        4     YES          4.0   4      2016  2016-04-04  1
1    55.0  0        7861                      0         38                 0                    263358.03  1.1               136           1    0        6     YES          6.0   4      2016  2016-04-06  1
2    65.0  0        146                       0         20                 0                    162000.00  1.3               129           1    1        7     YES          7.0   4      2016  2016-04-07  1
3    50.0  1        111                       0         20                 0                    210000.00  1.9               137           1    0        7     YES          7.0   4      2016  2016-04-07  1
4    65.0  1        160                       1         20                 0                    327000.00  2.7               116           0    0        8     YES          8.0   4      2016  2016-04-08  1
5    90.0  1        47                        0         40                 1                    204000.00  2.1               132           1    1        8     YES          8.0   4      2016  2016-04-08  1
6    75.0  1        246                       0         15                 0                    127000.00  1.2               137           1    0        10    YES          10.0  4      2016  2016-04-10  1
7    60.0  1        315                       1         60                 0                    454000.00  1.1               131           1    1        10    YES          10.0  4      2016  2016-04-10  1
8    65.0  0        157                       0         65                 0                    263358.03  1.5               138           0    0        10    YES          10.0  4      2016  2016-04-10  1
9    80.0  1        123                       0         35                 1                    388000.00  9.4               133           1    1        10    YES          10.0  4      2016  2016-04-10  1

Last rows

     age   anaemia  creatinine_phosphokinase  diabetes  ejection_fraction  high_blood_pressure  platelets  serum_creatinine  serum_sodium  sex  smoking  time  DEATH_EVENT  Day   Month  Year  Date        DEATH_EVENT_num
289  90.0  1        337                       0         38                 0                    390000.0   0.9               144           0    0        256   NO           12.0  12     2016  2016-12-12  0
290  45.0  0        615                       1         55                 0                    222000.0   0.8               141           0    0        257   NO           13.0  12     2016  2016-12-13  0
291  60.0  0        320                       0         35                 0                    133000.0   1.4               139           1    0        258   NO           14.0  12     2016  2016-12-14  0
292  52.0  0        190                       1         38                 0                    382000.0   1.0               140           1    1        258   NO           14.0  12     2016  2016-12-14  0
293  63.0  1        103                       1         35                 0                    179000.0   0.9               136           1    1        270   NO           26.0  12     2016  2016-12-26  0
294  62.0  0        61                        1         38                 1                    155000.0   1.1               143           1    1        270   NO           26.0  12     2016  2016-12-26  0
295  55.0  0        1820                      0         38                 0                    270000.0   1.2               139           0    0        271   NO           27.0  12     2016  2016-12-27  0
296  45.0  0        2060                      1         60                 0                    742000.0   0.8               138           0    0        278   NO           3.0   1      2015  2015-01-03  0
297  45.0  0        2413                      0         38                 0                    140000.0   1.4               140           1    1        280   NO           5.0   1      2015  2015-01-05  0
298  50.0  0        196                       0         45                 0                    395000.0   1.6               136           1    1        285   NO           10.0  1      2015  2015-01-10  0